Abstract



We study the extremality of the BEC and the BSC for Gallager's reliability function $E_0$ evaluated under the uniform input distribution for binary input DMCs from the aspect of channel polarization. In particular, we show that amongst all B-DMCs of a given $E_0(\rho)$ value, for a fixed $\rho > 0$, the BEC and BSC are extremal in the evolution of $E_0$ under the one-step polarization transformations.
I. Introduction



While the capacity of a memoryless channel $W$ gives the largest rate that may be communicated reliably across it, the reliability function $E(R, W)$ provides a finer measure of the quality of the channel: for any rate $R$ less than the channel capacity, it is possible to find a sequence of codes of increasing blocklength, each of rate at least $R$, whose block error probability decays exponentially to zero in the blocklength; $E(R, W)$ is the largest possible rate of this decay.



The material in this paper was presented in part at the IEEE International Symposium on Information Theory, Boston, USA, July 2012.



Gallager's classical treatise [1] gives a lower bound to $E(R, W)$, the random coding exponent $E_r(R, W)$, in the form $E_r(R, W) = \max_{\rho \in [0,1]} E_0(\rho, W) - \rho R$. Remarkably, this lower bound is tight for rates above the critical rate $E_0(1, W)$. The function $E_0(\rho, W)$, which appears as an auxiliary function on the road to deriving $E_r(R, W)$, turns out to be of independent interest. In particular, $E_0(\rho, W)/\rho$ is the largest rate at which a sequential decoder can operate while keeping the $\rho$-th moment of the decoder's computation effort per symbol bounded.

In [2], we investigated the extremal properties of $E_0(\rho, W)$ evaluated under the uniform input distribution for the class of binary input channels. We have shown that among all such channels with a given value of $E_0(\rho_1, W)$, for $\rho_1 \in [0, 1]$, the binary erasure channel (BEC) and the binary symmetric channel (BSC) distinguish themselves in certain ways: they have, respectively, the largest and smallest value of $E_0(\rho_2, W)$ for any $\rho_2 \in [\rho_1, 1]$. Furthermore, we showed that amongst channels $W$ with a given value of $E_0(\rho, W)$ for a given $\rho \in [0, 1]$, the BEC and BSC are the most and least polarizing under Arıkan's polar transformations in the sense that their polar transforms $W^+$ and $W^-$ have the largest and smallest difference in their $E_0$ values.

In this paper, we extend the result on the extremality of the BEC and BSC under Arıkan's polarization transforms to the region where $\rho > 0$. In his award-winning paper [3], Arıkan describes two synthetic channels $W^+$ and $W^-$, which can be obtained from two independent copies of $W$. It is well known (proved as a corollary to extremes of information combining) that among all channels $W$ with a given symmetric capacity $I(W)$, the BEC and BSC polarize most and least in the sense of having the largest and smallest difference between $I(W^+)$ and $I(W^-)$. We report a more general conclusion: amongst all channels $W$ with a given value of $E_0(\rho, W)$, the BEC and BSC polarize most and least in the sense of having the largest and smallest difference between $E_0(\rho, W^+)$ and $E_0(\rho, W^-)$ whenever $\rho \in [0, 1] \cup [2, \infty]$. On the other hand, for all $\rho \in [1, 2]$, we show that the BEC maximizes, and the BSC minimizes, the $E_0$ values obtained after applying either the $W^+$ or the $W^-$ transformation.

A. Definitions 

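Given a binary input channel $W$, let $E_0(\rho, W)$ denote "Gallager's $E_0$" [1, p. 138] evaluated for the uniform input distribution:

$$E_0(\rho, W) = -\log \sum_{y} \left[ \frac{1}{2} W(y \mid 0)^{\frac{1}{1+\rho}} + \frac{1}{2} W(y \mid 1)^{\frac{1}{1+\rho}} \right]^{1+\rho}. \tag{1}$$

Theorem 5.6.3 in [1] summarizes the properties of $E_0(\rho, W)$ with respect to the variable $\rho$. For $\rho > 0$, $E_0(\rho, W)$ is a positive, concave, increasing function of $\rho$. Moreover, the symmetric capacity $I(W)$ of the channel can be derived from $E_0(\rho, W)$ by

$$I(W) = \lim_{\rho \to 0} \frac{E_0(\rho, W)}{\rho} = \frac{\partial}{\partial \rho} E_0(\rho, W) \Big|_{\rho = 0}, \tag{2}$$

and the Bhattacharyya parameter $Z(W)$ from the cut-off rate as

$$E_0(1, W) = \log \frac{2}{1 + Z(W)}. \tag{3}$$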

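The next lemma, due to Telatar and Arıkan [4], introduces a useful representation for the $E_0(\rho, W)$ parameter.

Lemma 1: [4] Given a symmetric B-DMC $W$ and a fixed $\rho \in [0, 1]$, there exists a random variable $Z$ taking values in the $[0, 1]$ interval such that

$$E_0(\rho, W) = -\log \mathbb{E}\left[ g(\rho, Z) \right] \tag{4}$$

where

$$g(\rho, z) = \left( \frac{1}{2} (1 + z)^{\frac{1}{1+\rho}} + \frac{1}{2} (1 - z)^{\frac{1}{1+\rho}} \right)^{1+\rho}. \tag{5}$$

Moreover, the random variable $Z_{\mathrm{BEC}}$ of a binary erasure channel is $\{0, 1\}$ valued. The random variable $Z_{\mathrm{BSC}}$ of a binary symmetric channel is a constant $z_{\mathrm{BSC}}$.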



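Proof: Recall $E_0(\rho, W) = -\log \sum_{y} \left[ \frac{1}{2} W(y \mid 0)^{\frac{1}{1+\rho}} + \frac{1}{2} W(y \mid 1)^{\frac{1}{1+\rho}} \right]^{1+\rho}$. Define

$$q_W(y) = \frac{W(y \mid 0) + W(y \mid 1)}{2}, \qquad \Delta_W(y) = \frac{W(y \mid 0) - W(y \mid 1)}{W(y \mid 0) + W(y \mid 1)}, \tag{6}$$

so that $W(y \mid 0) = q_W(y)[1 + \Delta_W(y)]$ and $W(y \mid 1) = q_W(y)[1 - \Delta_W(y)]$. Then, one can define the random variable $Z = |\Delta_W(Y)| \in [0, 1]$, where $Y$ has the probability distribution $q_W(y)$, and obtain (4) by simple manipulations. The claims about $Z_{\mathrm{BEC}}$ and $Z_{\mathrm{BSC}}$ are verified easily from (6). ■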
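The representation in Lemma 1 can be checked numerically. The following is a minimal sketch, not part of the original paper: the function names `e0_direct`, `g`, and `e0_via_lemma1`, and the matrix layout `W[x, y] = W(y | x)`, are illustrative assumptions. It evaluates $E_0$ in nats once from definition (1) and once as $-\log \mathbb{E}[g(\rho, Z)]$ with $Z = |\Delta_W(Y)|$.

```python
import numpy as np

def e0_direct(rho, W):
    """E_0(rho, W) in nats from definition (1); W[x, y] = W(y | x), x in {0, 1}."""
    s = 1.0 / (1.0 + rho)
    inner = 0.5 * W[0] ** s + 0.5 * W[1] ** s
    return -np.log(np.sum(inner ** (1.0 + rho)))

def g(rho, z):
    """g(rho, z) from (5), for z in [0, 1]."""
    s = 1.0 / (1.0 + rho)
    return (0.5 * (1.0 + z) ** s + 0.5 * (1.0 - z) ** s) ** (1.0 + rho)

def e0_via_lemma1(rho, W):
    """E_0(rho, W) = -log E[g(rho, Z)] with Z = |Delta_W(Y)|, Y ~ q_W."""
    q = 0.5 * (W[0] + W[1])                  # output distribution q_W(y)
    keep = q > 0                             # ignore outputs that never occur
    delta = (W[0][keep] - W[1][keep]) / (2.0 * q[keep])
    z = np.clip(np.abs(delta), 0.0, 1.0)     # guard against rounding above 1
    return -np.log(np.sum(q[keep] * g(rho, z)))

bsc = np.array([[0.9, 0.1], [0.1, 0.9]])            # BSC with crossover 0.1
bec = np.array([[0.7, 0.3, 0.0], [0.0, 0.3, 0.7]])  # BEC with erasure 0.3
for W in (bsc, bec):
    print(e0_direct(0.5, W), e0_via_lemma1(0.5, W))  # the two values should agree
```

For the BSC the variable $Z$ is the constant $|1 - 2p|$ (here 0.8), and for the BEC it takes the value 0 on the erasure symbol and 1 otherwise, as the lemma states.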

II. EXTREMALITY RESULTS FOR THE POLARIZATION TRANSFORMATIONS 

A. Basic Polarization Transformations 

In [3], a low complexity code construction that achieves the symmetric capacity of B-DMCs is given based on the recursive application of two basic channel transformations. These transforms, usually referred to as the minus and plus transformations, synthesize two new channels by combining two independent copies of a given channel. The transition probabilities of the new channels are defined in terms of the initial one by the definitions given in [3, Eqs. (19), (20)].

Instead of identical copies of a given channel, we propose to combine two independent copies of different B-DMCs in a similar way. We denote by $W^-_{1,2} : \mathcal{X} \to \mathcal{Y}^2$ and $W^+_{1,2} : \mathcal{X} \to \mathcal{Y}^2 \times \mathcal{X}$ the synthesized channels obtained by combining independent copies of the channels $W_1$ and $W_2$. In this case, the transition probabilities can be defined by

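$$W^-_{1,2}(y_1 y_2 \mid u_1) = \sum_{u_2 \in \{0,1\}} \frac{1}{2} W_1(y_1 \mid u_1 \oplus u_2) W_2(y_2 \mid u_2), \tag{7}$$

$$W^+_{1,2}(y_1 y_2 u_1 \mid u_2) = \frac{1}{2} W_1(y_1 \mid u_1 \oplus u_2) W_2(y_2 \mid u_2). \tag{8}$$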
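As an illustration of (7) and (8), the two synthesized channels can be built explicitly from transition matrices. This is a sketch under assumptions that are not in the original: channels are stored as arrays with `W[x, y] = W(y | x)`, the pair $(y_1, y_2)$ is flattened row-major, and the names `minus_transform` and `plus_transform` are hypothetical.

```python
import numpy as np

def minus_transform(W1, W2):
    """W^-_{1,2}: input u1, output (y1, y2); implements (7)."""
    out = np.zeros((2, W1.shape[1] * W2.shape[1]))
    for u1 in (0, 1):
        for u2 in (0, 1):
            # 1/2 * W1(y1 | u1 xor u2) * W2(y2 | u2), summed over u2
            out[u1] += 0.5 * np.outer(W1[u1 ^ u2], W2[u2]).ravel()
    return out

def plus_transform(W1, W2):
    """W^+_{1,2}: input u2, output (y1, y2, u1); implements (8)."""
    rows = []
    for u2 in (0, 1):
        # outputs with u1 = 0 come first, then those with u1 = 1
        blocks = [0.5 * np.outer(W1[u1 ^ u2], W2[u2]).ravel() for u1 in (0, 1)]
        rows.append(np.concatenate(blocks))
    return np.vstack(rows)

bsc = np.array([[0.9, 0.1], [0.1, 0.9]])
bec = np.array([[0.6, 0.4, 0.0], [0.0, 0.4, 0.6]])
print(minus_transform(bsc, bec).shape, plus_transform(bsc, bec).shape)
```

Each row of the returned matrices is a probability distribution over the enlarged output alphabet, so the results can be fed to any routine that evaluates $E_0$ from a transition matrix.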

The following two lemmas express the $E_0$ parameter of the synthesized channels $W^-_{1,2}$ and $W^+_{1,2}$ in terms of the representation given in Lemma 1, relating them to the $E_0$ parameters of the channels $W_1$ and $W_2$.

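Lemma 2: Given two B-DMCs $W_1$, $W_2$, and $\rho > 0$, let $Z_1$ and $Z_2$ be independent RVs such that

$$E_0(\rho, W_1) = -\log \mathbb{E}\left[ g(\rho, Z_1) \right] \quad \text{and} \quad E_0(\rho, W_2) = -\log \mathbb{E}\left[ g(\rho, Z_2) \right]$$

hold as defined in Lemma 1. Then,

$$E_0(\rho, W^-_{1,2}) = -\log \mathbb{E}\left[ g(\rho, Z_1 Z_2) \right] \tag{9}$$

where $g(\rho, z)$ is given by (5).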



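Proof: From the definition of the channel $W^-_{1,2}$ in (7), and writing $q_i = q_{W_i}(y_i)$ and $\Delta_i = \Delta_{W_i}(y_i)$ for brevity, we can write

$$\begin{aligned}
E_0(\rho, W^-_{1,2}) &= -\log \sum_{y_1, y_2} \left[ \frac{1}{2} W^-_{1,2}(y_1, y_2 \mid 0)^{\frac{1}{1+\rho}} + \frac{1}{2} W^-_{1,2}(y_1, y_2 \mid 1)^{\frac{1}{1+\rho}} \right]^{1+\rho} \\
&= -\log \sum_{y_1, y_2} \left[ \frac{1}{2} \left( \frac{1}{2} W_1(y_1 \mid 0) W_2(y_2 \mid 0) + \frac{1}{2} W_1(y_1 \mid 1) W_2(y_2 \mid 1) \right)^{\frac{1}{1+\rho}} \right. \\
&\qquad\qquad\quad \left. + \frac{1}{2} \left( \frac{1}{2} W_1(y_1 \mid 1) W_2(y_2 \mid 0) + \frac{1}{2} W_1(y_1 \mid 0) W_2(y_2 \mid 1) \right)^{\frac{1}{1+\rho}} \right]^{1+\rho} \\
&= -\log \sum_{y_1, y_2} q_1 q_2 \left[ \frac{1}{2} \left( \frac{1}{2} \big( (1 + \Delta_1)(1 + \Delta_2) + (1 - \Delta_1)(1 - \Delta_2) \big) \right)^{\frac{1}{1+\rho}} \right. \\
&\qquad\qquad\qquad\quad \left. + \frac{1}{2} \left( \frac{1}{2} \big( (1 - \Delta_1)(1 + \Delta_2) + (1 + \Delta_1)(1 - \Delta_2) \big) \right)^{\frac{1}{1+\rho}} \right]^{1+\rho} \\
&= -\log \sum_{y_1, y_2} q_1 q_2 \left[ \frac{1}{2} (1 + \Delta_1 \Delta_2)^{\frac{1}{1+\rho}} + \frac{1}{2} (1 - \Delta_1 \Delta_2)^{\frac{1}{1+\rho}} \right]^{1+\rho}
\end{aligned}$$

where we used the definitions in (6). We can now define $Z_1 = |\Delta_{W_1}(Y_1)|$ and $Z_2 = |\Delta_{W_2}(Y_2)|$, where $Y_1$ and $Y_2$ are independent random variables with distributions $q_{W_1}$ and $q_{W_2}$, respectively. Since the last bracket is unchanged when $\Delta_1 \Delta_2$ is replaced by $|\Delta_1 \Delta_2|$, the lemma follows from this construction. ■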
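Lemma 3: Given two B-DMCs $W_1$, $W_2$, and $\rho > 0$, let $Z_1$ and $Z_2$ be as in Lemma 2. Then,

$$E_0(\rho, W^+_{1,2}) = -\log \mathbb{E}\left[ \frac{1}{2} (1 + Z_1 Z_2)\, g\!\left(\rho, \frac{Z_1 + Z_2}{1 + Z_1 Z_2}\right) + \frac{1}{2} (1 - Z_1 Z_2)\, g\!\left(\rho, \frac{Z_1 - Z_2}{1 - Z_1 Z_2}\right) \right] \tag{10}$$

where $g(\rho, z)$ is given by (5).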



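Proof: From the definition of the channel $W^+_{1,2}$ in (8), we can write

$$\begin{aligned}
E_0(\rho, W^+_{1,2}) &= -\log \sum_{y_1, y_2, u_1} \left[ \frac{1}{2} W^+_{1,2}(y_1 y_2 u_1 \mid 0)^{\frac{1}{1+\rho}} + \frac{1}{2} W^+_{1,2}(y_1 y_2 u_1 \mid 1)^{\frac{1}{1+\rho}} \right]^{1+\rho} \\
&= -\log \sum_{y_1, y_2} \frac{1}{2} \left\{ \left[ \frac{1}{2} \big( W_1(y_1 \mid 0) W_2(y_2 \mid 0) \big)^{\frac{1}{1+\rho}} + \frac{1}{2} \big( W_1(y_1 \mid 1) W_2(y_2 \mid 1) \big)^{\frac{1}{1+\rho}} \right]^{1+\rho} \right. \\
&\qquad\qquad\qquad \left. + \left[ \frac{1}{2} \big( W_1(y_1 \mid 1) W_2(y_2 \mid 0) \big)^{\frac{1}{1+\rho}} + \frac{1}{2} \big( W_1(y_1 \mid 0) W_2(y_2 \mid 1) \big)^{\frac{1}{1+\rho}} \right]^{1+\rho} \right\}.
\end{aligned}$$

Using (6), and writing $q_i = q_{W_i}(y_i)$ and $\Delta_i = \Delta_{W_i}(y_i)$ for brevity, we have

$$\begin{aligned}
E_0(\rho, W^+_{1,2}) &= -\log \sum_{y_1, y_2} \frac{1}{2} q_1 q_2 \left\{ \left[ \frac{1}{2} \big( (1 + \Delta_1)(1 + \Delta_2) \big)^{\frac{1}{1+\rho}} + \frac{1}{2} \big( (1 - \Delta_1)(1 - \Delta_2) \big)^{\frac{1}{1+\rho}} \right]^{1+\rho} \right. \\
&\qquad\qquad\qquad\qquad \left. + \left[ \frac{1}{2} \big( (1 - \Delta_1)(1 + \Delta_2) \big)^{\frac{1}{1+\rho}} + \frac{1}{2} \big( (1 + \Delta_1)(1 - \Delta_2) \big)^{\frac{1}{1+\rho}} \right]^{1+\rho} \right\} \\
&= -\log \left( \sum_{y_1, y_2} \frac{1}{2} q_1 q_2 (1 + \Delta_1 \Delta_2) \left[ \frac{1}{2} \left( 1 + \frac{\Delta_1 + \Delta_2}{1 + \Delta_1 \Delta_2} \right)^{\frac{1}{1+\rho}} + \frac{1}{2} \left( 1 - \frac{\Delta_1 + \Delta_2}{1 + \Delta_1 \Delta_2} \right)^{\frac{1}{1+\rho}} \right]^{1+\rho} \right. \\
&\qquad\qquad \left. + \sum_{y_1, y_2} \frac{1}{2} q_1 q_2 (1 - \Delta_1 \Delta_2) \left[ \frac{1}{2} \left( 1 - \frac{\Delta_1 - \Delta_2}{1 - \Delta_1 \Delta_2} \right)^{\frac{1}{1+\rho}} + \frac{1}{2} \left( 1 + \frac{\Delta_1 - \Delta_2}{1 - \Delta_1 \Delta_2} \right)^{\frac{1}{1+\rho}} \right]^{1+\rho} \right) \\
&= -\log \left( \sum_{y_1, y_2} \frac{1}{2} q_1 q_2 (1 + \Delta_1 \Delta_2)\, g\!\left(\rho, \frac{\Delta_1 + \Delta_2}{1 + \Delta_1 \Delta_2}\right) + \sum_{y_1, y_2} \frac{1}{2} q_1 q_2 (1 - \Delta_1 \Delta_2)\, g\!\left(\rho, \frac{\Delta_1 - \Delta_2}{1 - \Delta_1 \Delta_2}\right) \right)
\end{aligned}$$

where $g(\rho, z)$ is defined in (5).

Similar to the $E_0(\rho, W^-_{1,2})$ case, we define $Z_1 = |\Delta_{W_1}(Y_1)|$ and $Z_2 = |\Delta_{W_2}(Y_2)|$, where $Y_1$ and $Y_2$ are independent random variables with distributions $q_{W_1}$ and $q_{W_2}$, respectively. However, we should check whether this construction is equivalent to the above equation. We note that $\Delta \in [-1, 1]$. When $\Delta_{W_1}(y_1)$ and $\Delta_{W_2}(y_2)$ are of the same sign, we can easily see (noting that $g(\rho, z)$ is symmetric about $z = 0$) that

$$(1 + \Delta_1 \Delta_2)\, g\!\left(\rho, \frac{\Delta_1 + \Delta_2}{1 + \Delta_1 \Delta_2}\right) = (1 + Z_1 Z_2)\, g\!\left(\rho, \frac{Z_1 + Z_2}{1 + Z_1 Z_2}\right), \qquad (1 - \Delta_1 \Delta_2)\, g\!\left(\rho, \frac{\Delta_1 - \Delta_2}{1 - \Delta_1 \Delta_2}\right) = (1 - Z_1 Z_2)\, g\!\left(\rho, \frac{Z_1 - Z_2}{1 - Z_1 Z_2}\right).$$

When $\Delta_{W_1}(y_1)$ and $\Delta_{W_2}(y_2)$ are of opposite sign, we note that

$$(1 - \Delta_1 \Delta_2)\, g\!\left(\rho, \frac{\Delta_1 - \Delta_2}{1 - \Delta_1 \Delta_2}\right) = (1 + Z_1 Z_2)\, g\!\left(\rho, \frac{Z_1 + Z_2}{1 + Z_1 Z_2}\right), \qquad (1 + \Delta_1 \Delta_2)\, g\!\left(\rho, \frac{\Delta_1 + \Delta_2}{1 + \Delta_1 \Delta_2}\right) = (1 - Z_1 Z_2)\, g\!\left(\rho, \frac{Z_1 - Z_2}{1 - Z_1 Z_2}\right).$$

Since we are interested in the sum of the above two parts, we can see that the construction we propose is still equivalent. This concludes the proof. ■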
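Lemmas 2 and 3 can be verified numerically by comparing $E_0$ of the explicitly constructed channels with the expectations over $Z_1$ and $Z_2$. The sketch below is illustrative only: all function names are hypothetical, channels are arrays with `W[x, y] = W(y | x)`, and it assumes every output has positive probability under both inputs so that $Z_1 Z_2 < 1$ and no division by zero occurs.

```python
import numpy as np

def e0(rho, W):
    s = 1.0 / (1.0 + rho)
    return -np.log(np.sum((0.5 * W[0] ** s + 0.5 * W[1] ** s) ** (1.0 + rho)))

def g(rho, z):
    z = np.clip(z, -1.0, 1.0)  # g is symmetric about z = 0
    s = 1.0 / (1.0 + rho)
    return (0.5 * (1 + z) ** s + 0.5 * (1 - z) ** s) ** (1.0 + rho)

def h(rho, z1, z2):
    """Integrand of the plus-transform representation in Lemma 3."""
    p = z1 * z2
    return (0.5 * (1 + p) * g(rho, (z1 + z2) / (1 + p))
            + 0.5 * (1 - p) * g(rho, (z1 - z2) / (1 - p)))

def z_dist(W):
    """Values and probabilities of Z = |Delta_W(Y)| with Y ~ q_W."""
    q = 0.5 * (W[0] + W[1])
    return np.abs(W[0] - W[1]) / (2.0 * q), q

def minus_transform(W1, W2):
    return np.array([sum(0.5 * np.outer(W1[u1 ^ u2], W2[u2]).ravel()
                         for u2 in (0, 1)) for u1 in (0, 1)])

def plus_transform(W1, W2):
    return np.array([np.concatenate([0.5 * np.outer(W1[u1 ^ u2], W2[u2]).ravel()
                                     for u1 in (0, 1)]) for u2 in (0, 1)])

def e0_from_z(rho, W1, W2, kernel):
    """-log E[kernel(rho, Z1, Z2)] for independent Z1, Z2."""
    (z1, q1), (z2, q2) = z_dist(W1), z_dist(W2)
    Z1, Z2 = np.meshgrid(z1, z2, indexing="ij")
    return -np.log(np.sum(np.outer(q1, q2) * kernel(rho, Z1, Z2)))

W1 = np.array([[0.9, 0.1], [0.1, 0.9]])
W2 = np.array([[0.5, 0.3, 0.2], [0.1, 0.3, 0.6]])
rho = 0.5
print(e0(rho, minus_transform(W1, W2)),
      e0_from_z(rho, W1, W2, lambda r, a, b: g(r, a * b)))   # Lemma 2
print(e0(rho, plus_transform(W1, W2)), e0_from_z(rho, W1, W2, h))  # Lemma 3
```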

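Remark 1: By the symmetry of the RVs $Z_1$ and $Z_2$, we have $E_0(\rho, W^{\pm}_{1,2}) = E_0(\rho, W^{\pm}_{2,1})$.

Lemma 4: The channels $W^-_{1,2}$, $W_1$, $W_2$, and $W^+_{1,2}$ satisfy the following ordering:

$$\begin{aligned}
E_0(\rho, W^-_{1,2}) &\leq E_0(\rho, W_1) \leq E_0(\rho, W^+_{1,2}), \\
E_0(\rho, W^-_{1,2}) &\leq E_0(\rho, W_2) \leq E_0(\rho, W^+_{1,2}).
\end{aligned} \tag{11}$$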

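Proof: We only show the inequalities in (11) for the channel $W_1$. The proof for the channel $W_2$ follows from Remark 1. By Lemmas 1, 2, and 3, the inequalities in (11) are equivalent to

$$\mathbb{E}\left[ \frac{1}{2} (1 + Z_1 Z_2)\, g\!\left(\rho, \frac{Z_1 + Z_2}{1 + Z_1 Z_2}\right) + \frac{1}{2} (1 - Z_1 Z_2)\, g\!\left(\rho, \frac{Z_1 - Z_2}{1 - Z_1 Z_2}\right) \right] \leq \mathbb{E}\left[ g(\rho, Z_1) \right], \tag{12}$$

$$\mathbb{E}\left[ g(\rho, Z_1) \right] \leq \mathbb{E}\left[ g(\rho, Z_1 Z_2) \right]. \tag{13}$$

By Lemma 7, the function $g(\rho, z)$ is non-increasing in the variable $z$ when $\rho > 0$. Since $Z_1 Z_2 \leq Z_1$, the inequality in (13) holds. On the other side, note that for any realizations $z_1$ and $z_2$, the factors $\frac{1}{2}(1 + z_1 z_2)$ and $\frac{1}{2}(1 - z_1 z_2)$ form a distribution. As the function $g(\rho, z)$ is concave in $z$ by Lemma 7, we can apply Jensen's inequality to obtain

$$\frac{1}{2} (1 + z_1 z_2)\, g\!\left(\rho, \frac{z_1 + z_2}{1 + z_1 z_2}\right) + \frac{1}{2} (1 - z_1 z_2)\, g\!\left(\rho, \frac{z_1 - z_2}{1 - z_1 z_2}\right) \leq g\!\left(\rho, \frac{1}{2}(z_1 + z_2) + \frac{1}{2}(z_1 - z_2)\right) = g(\rho, z_1).$$

Taking the expectation of both sides, we get the inequality in (12). ■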
Remark 2: In [5] it is shown that the channels $W_1$, $W_2$, $W^-_{1,2}$, and $W^+_{1,2}$ satisfy the relationship:

$$E_0(\rho, W^+_{1,2}) + E_0(\rho, W^-_{1,2}) \geq E_0(\rho, W_1) + E_0(\rho, W_2), \quad \forall \rho > 0.$$
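The ordering of Lemma 4 and the inequality of Remark 2 can be illustrated directly in the $Z$-representation. In this sketch, which is not from the original paper, a channel is described only by the distribution of its variable $Z$, given as a list of `(value, probability)` pairs; the function names and the example distributions are assumptions chosen for illustration.

```python
import math

def g(rho, z):
    z = min(abs(z), 1.0)  # g is symmetric in z; clamp rounding above 1
    s = 1.0 / (1.0 + rho)
    return (0.5 * (1 + z) ** s + 0.5 * (1 - z) ** s) ** (1 + rho)

def h(rho, z1, z2):
    p = z1 * z2
    first = 0.5 * (1 + p) * g(rho, (z1 + z2) / (1 + p))
    # when p == 1 the second term has weight zero
    second = 0.5 * (1 - p) * g(rho, (z1 - z2) / (1 - p)) if p < 1 else 0.0
    return first + second

def expect(kernel, d1, d2):
    """E[kernel(Z1, Z2)] for independent Z1 ~ d1, Z2 ~ d2."""
    return sum(p1 * p2 * kernel(z1, z2) for z1, p1 in d1 for z2, p2 in d2)

def e0_values(rho, d1, d2):
    """Return E0 of W1, W2, W^-_{1,2}, W^+_{1,2} in nats."""
    e1 = -math.log(sum(p * g(rho, z) for z, p in d1))
    e2 = -math.log(sum(p * g(rho, z) for z, p in d2))
    em = -math.log(expect(lambda a, b: g(rho, a * b), d1, d2))  # Lemma 2
    ep = -math.log(expect(lambda a, b: h(rho, a, b), d1, d2))   # Lemma 3
    return e1, e2, em, ep

d1 = [(0.2, 0.5), (0.9, 0.5)]  # a channel whose Z takes two values
d2 = [(0.6, 1.0)]              # a BSC: Z is constant
e1, e2, em, ep = e0_values(1.0, d1, d2)
print(em <= min(e1, e2), max(e1, e2) <= ep, ep + em >= e1 + e2)
```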

B. Extremality for the Basic Channel Transformations 

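Theorem 1: Given two B-DMCs $W_1$ and $W_2$, for any fixed value of $\rho > 0$, we define two binary symmetric channels $W^{(1)}_{\mathrm{BSC}}$ and $W^{(2)}_{\mathrm{BSC}}$, and two binary erasure channels $W^{(1)}_{\mathrm{BEC}}$ and $W^{(2)}_{\mathrm{BEC}}$, through the equalities

$$E_0(\rho, W_1) = E_0(\rho, W^{(1)}_{\mathrm{BEC}}) = E_0(\rho, W^{(1)}_{\mathrm{BSC}}), \tag{14}$$
$$E_0(\rho, W_2) = E_0(\rho, W^{(2)}_{\mathrm{BEC}}) = E_0(\rho, W^{(2)}_{\mathrm{BSC}}). \tag{15}$$

Then for the $W^-_{1,2}$ polar transformation, we have

$$E_0(\rho, W^-_{\mathrm{BEC},\mathrm{BEC}}) \leq E_0(\rho, W^-_{1,2}) \leq E_0(\rho, W^-_{\mathrm{BSC},\mathrm{BSC}}). \tag{16}$$

For the $W^+_{1,2}$ polar transformation, we have

$$E_0(\rho, W^+_{\mathrm{BSC},\mathrm{BSC}}) \leq E_0(\rho, W^+_{1,2}) \leq E_0(\rho, W^+_{\mathrm{BEC},\mathrm{BEC}}) \quad \forall \rho \in [0, 1] \cup [2, \infty], \tag{17}$$

$$E_0(\rho, W^+_{\mathrm{BEC},\mathrm{BEC}}) \leq E_0(\rho, W^+_{1,2}) \leq E_0(\rho, W^+_{\mathrm{BSC},\mathrm{BSC}}) \quad \forall \rho \in [1, 2]. \tag{18}$$

Here $W^{\pm}_{\mathrm{BEC},\mathrm{BEC}}$ and $W^{\pm}_{\mathrm{BSC},\mathrm{BSC}}$ denote the channels synthesized from the two matched erasure channels and the two matched symmetric channels, respectively.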
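The statement of Theorem 1 can be explored numerically in terms of $\exp\{-E_0\}$, which reverses the direction of every inequality. The sketch below is an illustration under assumptions, not the paper's procedure: a channel is given by the distribution of its variable $Z$ as `(value, probability)` pairs, the matched BSC is found by bisection on $g(\rho, \cdot)$, the matched BEC by solving $\exp\{-E_0\} = \epsilon(1 - 2^{-\rho}) + 2^{-\rho}$ for $\epsilon$, and all names are hypothetical. It requires $\rho > 0$.

```python
import math

def g(rho, z):
    z = min(abs(z), 1.0)
    s = 1.0 / (1.0 + rho)
    return (0.5 * (1 + z) ** s + 0.5 * (1 - z) ** s) ** (1 + rho)

def h(rho, z1, z2):
    p = z1 * z2
    first = 0.5 * (1 + p) * g(rho, (z1 + z2) / (1 + p))
    second = 0.5 * (1 - p) * g(rho, (z1 - z2) / (1 - p)) if p < 1 else 0.0
    return first + second

def g_inverse(rho, t, iters=200):
    """Solve g(rho, z) = t for z in [0, 1] by bisection (g is non-increasing in z)."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if g(rho, mid) > t:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def theorem1_values(rho, d1, d2):
    """exp(-E0) of W^-, W^+ for (W1, W2) and for the matched BSC and BEC pairs."""
    a = 2.0 ** (-rho)
    t1 = sum(p * g(rho, z) for z, p in d1)            # exp(-E0(rho, W1))
    t2 = sum(p * g(rho, z) for z, p in d2)            # exp(-E0(rho, W2))
    e1, e2 = (t1 - a) / (1 - a), (t2 - a) / (1 - a)   # matched erasure probabilities
    z1, z2 = g_inverse(rho, t1), g_inverse(rho, t2)   # matched BSC constants
    pairs = [(x, p, y, q) for x, p in d1 for y, q in d2]
    minus = {"bsc": g(rho, z1 * z2),
             "w": sum(p * q * g(rho, x * y) for x, p, y, q in pairs),
             "bec": a + (1 - a) * (e1 + e2 - e1 * e2)}
    plus = {"bsc": h(rho, z1, z2),
            "w": sum(p * q * h(rho, x, y) for x, p, y, q in pairs),
            "bec": a + (1 - a) * e1 * e2}
    return minus, plus

d = [(0.2, 0.5), (0.9, 0.5)]
for rho in (0.5, 1.5, 3.0):
    print(rho, theorem1_values(rho, d, d))
```

For $\rho = 0.5$ and $\rho = 3$ the printed plus-transform values are expected to satisfy `bec <= w <= bsc`, and for $\rho = 1.5$ the reverse order, in line with (17) and (18).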

Proof: We start by showing the result for the minus transformation given in Equation (16). This proof relies on the convexity result stated in the next lemma. The proof of the lemma is given in Appendix A.

Lemma 5: For any $z \in [0, 1]$ and $\rho > 0$, the function $F_{z,\rho}(t) : [2^{-\rho}, 1] \to [g(\rho, z), 1]$ defined as

$$F_{z,\rho}(t) = g(\rho, z\, g^{-1}(\rho, t)) \tag{19}$$

where $g^{-1}(\rho, t)$ denotes the inverse of the function $g$ with respect to its second argument, is convex with respect to the variable $t$.



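From Lemmas 1 and 2, we know that

$$\begin{aligned}
\exp\{-E_0(\rho, W_1)\} &= \mathbb{E}\left[ g(\rho, Z_1) \right], \\
\exp\{-E_0(\rho, W_2)\} &= \mathbb{E}\left[ g(\rho, Z_2) \right], \\
\exp\{-E_0(\rho, W^-_{1,2})\} &= \mathbb{E}\left[ g(\rho, Z_1 Z_2) \right],
\end{aligned}$$

where $Z_1$ and $Z_2$ are independent random variables. We also know that $Z^{(1)}_{\mathrm{BSC}} = z^{(1)}_{\mathrm{BSC}}$ and $Z^{(2)}_{\mathrm{BSC}} = z^{(2)}_{\mathrm{BSC}}$ are constants, and that $Z^{(1)}_{\mathrm{BEC}}, Z^{(2)}_{\mathrm{BEC}} \in \{0, 1\}$. Hence,

$$\exp\{-E_0(\rho, W^-_{\mathrm{BSC},\mathrm{BSC}})\} = g\big(\rho, z^{(1)}_{\mathrm{BSC}} z^{(2)}_{\mathrm{BSC}}\big).$$

Given $E_0(\rho, W_1) = E_0(\rho, W^{(1)}_{\mathrm{BSC}})$ and $E_0(\rho, W_2) = E_0(\rho, W^{(2)}_{\mathrm{BSC}})$, we also have

$$\mathbb{E}\left[ g(\rho, Z_1) \right] = g\big(\rho, z^{(1)}_{\mathrm{BSC}}\big), \qquad \mathbb{E}\left[ g(\rho, Z_2) \right] = g\big(\rho, z^{(2)}_{\mathrm{BSC}}\big).$$

Therefore, using Jensen's inequality we obtain

$$\begin{aligned}
\exp\{-E_0(\rho, W^-_{1,2})\} &= \mathbb{E}_{Z_1}\Big[ \mathbb{E}_{Z_2}\big[ F_{z_1,\rho}(g(\rho, Z_2)) \mid Z_1 = z_1 \big] \Big] \\
&\geq \mathbb{E}_{Z_1}\Big[ F_{Z_1,\rho}\big( \mathbb{E}_{Z_2}[g(\rho, Z_2)] \big) \Big] \\
&= \mathbb{E}_{Z_1}\Big[ F_{Z_1,\rho}\big( g(\rho, z^{(2)}_{\mathrm{BSC}}) \big) \Big] \\
&\overset{(1)}{=} \mathbb{E}_{Z_1}\Big[ F_{z^{(2)}_{\mathrm{BSC}},\rho}\big( g(\rho, Z_1) \big) \Big] \\
&\geq F_{z^{(2)}_{\mathrm{BSC}},\rho}\big( \mathbb{E}_{Z_1}[g(\rho, Z_1)] \big) \\
&= F_{z^{(2)}_{\mathrm{BSC}},\rho}\big( g(\rho, z^{(1)}_{\mathrm{BSC}}) \big) \\
&= \exp\{-E_0(\rho, W^-_{\mathrm{BSC},\mathrm{BSC}})\}
\end{aligned}$$

where (1) follows by the symmetry of the variables $z_1$ and $z^{(2)}_{\mathrm{BSC}}$, since $F_{z_1,\rho}(g(\rho, z_2)) = g(\rho, z_1 z_2) = F_{z_2,\rho}(g(\rho, z_1))$.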

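Let $\epsilon_1$ and $\epsilon_2$ be the erasure probabilities of $W^{(1)}_{\mathrm{BEC}}$ and $W^{(2)}_{\mathrm{BEC}}$, respectively. Then, we have $P(Z^{(1)}_{\mathrm{BEC}} = 0) = \epsilon_1$, $P(Z^{(2)}_{\mathrm{BEC}} = 0) = \epsilon_2$, and

$$\begin{aligned}
\exp\{-E_0(\rho, W^{(1)}_{\mathrm{BEC}})\} &= P(Z^{(1)}_{\mathrm{BEC}} = 0)(1 - 2^{-\rho}) + 2^{-\rho}, \\
\exp\{-E_0(\rho, W^{(2)}_{\mathrm{BEC}})\} &= P(Z^{(2)}_{\mathrm{BEC}} = 0)(1 - 2^{-\rho}) + 2^{-\rho}.
\end{aligned}$$

The channel $W^-_{\mathrm{BEC},\mathrm{BEC}}$ is a BEC with erasure probability $\epsilon_1 + \epsilon_2 - \epsilon_1 \epsilon_2$, hence we get

$$\exp\{-E_0(\rho, W^-_{\mathrm{BEC},\mathrm{BEC}})\} = \left[ P(Z^{(1)}_{\mathrm{BEC}} = 0) + P(Z^{(2)}_{\mathrm{BEC}} = 0) - P(Z^{(1)}_{\mathrm{BEC}} = 0) P(Z^{(2)}_{\mathrm{BEC}} = 0) \right] (1 - 2^{-\rho}) + 2^{-\rho}.$$

Therefore, given $E_0(\rho, W_1) = E_0(\rho, W^{(1)}_{\mathrm{BEC}})$ and $E_0(\rho, W_2) = E_0(\rho, W^{(2)}_{\mathrm{BEC}})$, we have

$$\begin{aligned}
\mathbb{E}\left[ g(\rho, Z_1) \right] &= \mathbb{E}\big[ g(\rho, Z^{(1)}_{\mathrm{BEC}}) \big] = P(Z^{(1)}_{\mathrm{BEC}} = 0)(1 - 2^{-\rho}) + 2^{-\rho}, \\
\mathbb{E}\left[ g(\rho, Z_2) \right] &= \mathbb{E}\big[ g(\rho, Z^{(2)}_{\mathrm{BEC}}) \big] = P(Z^{(2)}_{\mathrm{BEC}} = 0)(1 - 2^{-\rho}) + 2^{-\rho}.
\end{aligned}$$

Due to convexity, $F_{z,\rho}(t)$ lies below the chord joining its endpoint values $F_{z,\rho}(2^{-\rho}) = g(\rho, z)$ and $F_{z,\rho}(1) = 1$, so the following inequality holds:

$$F_{z,\rho}(t) \leq \frac{1 - t}{1 - 2^{-\rho}} F_{z,\rho}(2^{-\rho}) + \frac{t - 2^{-\rho}}{1 - 2^{-\rho}} F_{z,\rho}(1) = 1 + \frac{(g(\rho, z) - 1)(t - 1)}{2^{-\rho} - 1}. \tag{20}$$

Therefore, writing $P_i = P(Z^{(i)}_{\mathrm{BEC}} = 0)$ and using the independence of $Z_1$ and $Z_2$,

$$\begin{aligned}
\exp\{-E_0(\rho, W^-_{1,2})\} &= \mathbb{E}\left[ F_{Z_1,\rho}(g(\rho, Z_2)) \right] \\
&\leq \mathbb{E}\left[ 1 + \frac{(g(\rho, Z_1) - 1)(g(\rho, Z_2) - 1)}{2^{-\rho} - 1} \right] \\
&= 1 + \frac{\big( \mathbb{E}_{Z_1}[g(\rho, Z_1)] - 1 \big) \big( \mathbb{E}_{Z_2}[g(\rho, Z_2)] - 1 \big)}{2^{-\rho} - 1} \\
&= 1 + \frac{\big[ P_1 (1 - 2^{-\rho}) + 2^{-\rho} - 1 \big] \big[ P_2 (1 - 2^{-\rho}) + 2^{-\rho} - 1 \big]}{2^{-\rho} - 1} \\
&= 1 - P_1 P_2 (1 - 2^{-\rho}) + (P_1 + P_2)(1 - 2^{-\rho}) + 2^{-\rho} - 1 \\
&= \left[ P_1 + P_2 - P_1 P_2 \right] (1 - 2^{-\rho}) + 2^{-\rho} \\
&= \exp\{-E_0(\rho, W^-_{\mathrm{BEC},\mathrm{BEC}})\}.
\end{aligned}$$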

This concludes the proof for the minus transformation. Now, we sketch the proof of the extremality property for the plus transformation. We define the function $h(\rho, z_1, z_2)$ as

$$h(\rho, z_1, z_2) = \frac{1}{2} (1 + z_1 z_2)\, g\!\left(\rho, \frac{z_1 + z_2}{1 + z_1 z_2}\right) + \frac{1}{2} (1 - z_1 z_2)\, g\!\left(\rho, \frac{z_1 - z_2}{1 - z_1 z_2}\right) \tag{21}$$

where $z_1, z_2 \in [0, 1]$, and $\rho > 0$. Note that $h(\rho, z_1, z_2)$ is symmetric in the variables $z_1$ and $z_2$. The proof relies on the result stated in the next lemma. The proof of the lemma is given in Appendix B.

Lemma 6: [6] For any $z \in [0, 1]$ and $\rho > 0$, the function $H_{z,\rho}(t) : [2^{-\rho}, 1] \to [2^{-\rho}, g(\rho, z)]$ defined as

$$H_{z,\rho}(t) = h(\rho, g^{-1}(\rho, t), z)$$

is concave with respect to the variable $t$ when $\rho \in [0, 1] \cup [2, \infty]$, and convex when $\rho \in [1, 2]$.



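The proof of the theorem for the plus transformation can be completed following similar steps to the minus case. By Lemma 3, we have

$$\mathbb{E}\left[ h(\rho, Z_1, Z_2) \right] = \exp\{-E_0(\rho, W^+_{1,2})\}.$$

We define the random variables

$$T_1 = g(\rho, Z_1) \quad \text{and} \quad T_2 = g(\rho, Z_2).$$

Then, using the concavity of the function $H_{z,\rho}(t)$ with respect to $t$ for fixed values of $\rho \in [0, 1] \cup [2, \infty]$ and $z \in [0, 1]$, we obtain the inequalities in (17):

$$\exp\{-E_0(\rho, W^+_{1,2})\} = \mathbb{E}\left[ H_{g^{-1}(\rho, T_2),\rho}(T_1) \right] \leq h\big(\rho, z^{(1)}_{\mathrm{BSC}}, z^{(2)}_{\mathrm{BSC}}\big) = \exp\{-E_0(\rho, W^+_{\mathrm{BSC},\mathrm{BSC}})\},$$

and

$$\exp\{-E_0(\rho, W^+_{1,2})\} = \mathbb{E}\left[ H_{g^{-1}(\rho, T_2),\rho}(T_1) \right] \geq 2^{-\rho} + P(Z^{(1)}_{\mathrm{BEC}} = 0) P(Z^{(2)}_{\mathrm{BEC}} = 0) \left( 1 - 2^{-\rho} \right) = \exp\{-E_0(\rho, W^+_{\mathrm{BEC},\mathrm{BEC}})\}.$$

Similarly, the convexity of the function $H_{z,\rho}(t)$ with respect to $t$ for $\rho \in [1, 2]$ leads to the reverse inequalities in (18). ■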

C. Special p Values 

In Theorem 1, we have shown that among all B-DMCs $W$ of fixed $E_0(\rho, W)$, the binary erasure channel's minus transformation results in a lower bound to any $E_0(\rho, W^-)$ and the binary symmetric channel's in an upper bound to any $E_0(\rho, W^-)$. For the plus transformation, a similar extremality property holds, except that the result breaks into two parts depending on the value of the parameter $\rho$: while the binary erasure and binary symmetric channels appear on opposite sides of the inequalities for $E_0(\rho, W^-)$ and $E_0(\rho, W^+)$ when $\rho \in [0, 1] \cup [2, \infty]$, they appear on the same side when $\rho \in [1, 2]$. Using these results, we identify in this section some special values of $\rho$ to recover known results and discover new ones.

1) $\rho = 0$, Symmetric capacity: In [3], it is shown that the symmetric capacity is preserved under the basic polarization transformations. Namely, the channels satisfy:

$$2 I(W) = I(W^-) + I(W^+).$$

This relation implies that the process attached to the symmetric capacities of the synthesized channels is a bounded martingale, and hence converges almost surely.

Corollary 1: Under the same assumptions as Theorem 1 with $W_1 = W_2 = W$, we have

$$E_0(\rho, W^+_{\mathrm{BSC}}) - E_0(\rho, W^-_{\mathrm{BSC}}) \leq E_0(\rho, W^+) - E_0(\rho, W^-) \leq E_0(\rho, W^+_{\mathrm{BEC}}) - E_0(\rho, W^-_{\mathrm{BEC}})$$

for $\rho \in [0, 1]$.

Corollary 1 shows that amongst channels $W$ with a given value of $E_0(\rho, W)$ for a given $\rho$, the BEC and BSC are the most and least polarizing under Arıkan's polar transformations in the sense that their polar transforms $W^+$ and $W^-$ have the largest and smallest difference in their $E_0$ values. Dividing all sides of the inequality above by $\rho$ and taking the limit as $\rho \to 0$, we see that among channels of a given symmetric capacity, the BEC and BSC are extremal with respect to the polarization transformations, in the sense that

$$I(W^+_{\mathrm{BSC}}) - I(W^-_{\mathrm{BSC}}) \leq I(W^+) - I(W^-) \leq I(W^+_{\mathrm{BEC}}) - I(W^-_{\mathrm{BEC}}).$$

This is a known argument proving that the convergence is to the extremes of the $[0, 1]$ interval. The preservation property of the symmetric capacities holds regardless of whether the combined channels are identical or not, as it is a consequence of the chain rule for mutual information. Namely, the channels satisfy:

$$I(W_1) + I(W_2) = I(W^-_{1,2}) + I(W^+_{1,2}),$$

and Theorem 1 can be used to show that the convergence is also to the extreme values $\{0, 1\}$ of the corresponding bounded martingale process.

Remark 3: These inequalities for the symmetric capacities can also be obtained from the results on the extremes of information combining [7], together with the fact that symmetric capacity is preserved under the polarization transformations [3].

2) $\rho = 1$, Cut-off rate, Bhattacharyya parameter: Another result of [3] can be recovered by letting $\rho = 1$. In this case, Theorem 1 implies that channels having equal cut-off rates satisfy

$$E_0(1, W^-_{\mathrm{BEC}}) \leq E_0(1, W^-) \leq E_0(1, W^-_{\mathrm{BSC}}),$$

$$E_0(1, W^+_{\mathrm{BSC}}) = E_0(1, W^+) = E_0(1, W^+_{\mathrm{BEC}}).$$

Moreover, by the definition in Equation (3), the extremalities for the Bhattacharyya parameter are also obtained. Indeed, we know $Z(W^+) = Z(W)^2$ by [3].

3) $\rho = 2$: A previously unknown result is found by taking $\rho = 2$ in the theorem. Similar to the case $\rho = 1$, we observe that the $E_0$ parameters of the channels $W^+$, $W^+_{\mathrm{BEC}}$, and $W^+_{\mathrm{BSC}}$ are equal to each other.



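D. Generalizations of the Bhattacharyya parameter

In this section, we discuss a generalization of the definition of the Bhattacharyya parameter. We propose an extension motivated by the $E_0$ parameter of BECs. Given a BEC $W_{\mathrm{BEC}}$ with erasure probability $\epsilon_{\mathrm{BEC}}$, we have

$$\epsilon_{\mathrm{BEC}} = \frac{2^{\rho} \exp\{-E_0(\rho, W_{\mathrm{BEC}})\} - 1}{2^{\rho} - 1}.$$

We also know that the Bhattacharyya parameter of a binary erasure channel satisfies $Z(W_{\mathrm{BEC}}) = \epsilon_{\mathrm{BEC}}$. This parameter provides tighter bounds than $E_0(1, W)$ in [3], and is used in the subsequent analysis. This gives the idea to define a quantity similar to $Z(W)$, referred to as $Z(\rho, W)$, which reflects the dependence on the value of $\rho$:

$$Z(\rho, W) = \frac{2^{\rho} \exp\{-E_0(\rho, W)\} - 1}{2^{\rho} - 1}.$$

Using the results we derived in the previous section, the next corollary shows how $Z(\rho, W)$ is affected by the basic channel transformations.

Corollary 2: Given a B-DMC $W$, for any fixed value of $\rho > 0$, we define a binary symmetric channel $W_{\mathrm{BSC}}$ and a binary erasure channel $W_{\mathrm{BEC}}$ through the equality

$$Z(\rho, W) = Z(\rho, W_{\mathrm{BEC}}) = Z(\rho, W_{\mathrm{BSC}}).$$

Then for the $W^-$ and $W^+$ polar transformations, we have

$$\begin{aligned}
& Z(\rho, W^-_{\mathrm{BSC}}) \leq Z(\rho, W^-) \leq Z(\rho, W^-_{\mathrm{BEC}}) = 2 Z(\rho, W_{\mathrm{BEC}}) - Z(\rho, W_{\mathrm{BEC}})^2, \\
& Z(\rho, W_{\mathrm{BEC}})^2 = Z(\rho, W^+_{\mathrm{BEC}}) \leq Z(\rho, W^+) \leq Z(\rho, W^+_{\mathrm{BSC}}) \quad \forall \rho \in [0, 1] \cup [2, \infty], \\
& Z(\rho, W^+_{\mathrm{BSC}}) \leq Z(\rho, W^+) \leq Z(\rho, W^+_{\mathrm{BEC}}) = Z(\rho, W_{\mathrm{BEC}})^2 \quad \forall \rho \in [1, 2].
\end{aligned} \tag{22}$$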
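A small numerical sketch may help to see how the parameter $Z(\rho, W)$ behaves. It assumes the normalization $Z(\rho, W) = (2^{\rho} e^{-E_0(\rho, W)} - 1)/(2^{\rho} - 1)$, which equals the erasure probability for a BEC and reduces to the usual Bhattacharyya parameter at $\rho = 1$; the function names and the array layout `W[x, y] = W(y | x)` are illustrative assumptions. At $\rho = 1$ and $\rho = 2$ the two-sided bounds of the corollary for the plus transformation coincide, so $Z(\rho, W^+) = Z(\rho, W)^2$ is expected there.

```python
import numpy as np

def e0(rho, W):
    s = 1.0 / (1.0 + rho)
    return -np.log(np.sum((0.5 * W[0] ** s + 0.5 * W[1] ** s) ** (1.0 + rho)))

def plus_transform(W1, W2):
    """W^+_{1,2}: input u2, output (y1, y2, u1)."""
    return np.array([np.concatenate([0.5 * np.outer(W1[u1 ^ u2], W2[u2]).ravel()
                                     for u1 in (0, 1)]) for u2 in (0, 1)])

def gen_bhattacharyya(rho, W):
    """Z(rho, W) under the normalization assumed above (requires rho > 0)."""
    return (2.0 ** rho * np.exp(-e0(rho, W)) - 1.0) / (2.0 ** rho - 1.0)

W = np.array([[0.5, 0.3, 0.2], [0.1, 0.3, 0.6]])
Wp = plus_transform(W, W)
for rho in (1.0, 2.0):
    # Z(rho, W^+) and Z(rho, W)^2 should coincide for these two values of rho
    print(rho, gen_bhattacharyya(rho, Wp), gen_bhattacharyya(rho, W) ** 2)
```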

III. Conclusions 

The extremality of the BEC and BSC for polar transforms can be interpreted in the context of information combining. Theorem 1 shows that even if we change the measure of information from the customary mutual information to $E_0$, the BEC and BSC still remain extremal.

The results of the theorem also show that the values $\rho = 1, 2$ share a common property: one can recover the value of the parameter $E_0(\rho, W^+)$ from the value of $E_0(\rho, W)$ without necessarily knowing the particular channel $W$. Finally, the extremality results of the theorem open up the possibility of applying the theory of channel polarization to combining arbitrary B-DMCs, the details of which will be investigated further in future work.
Appendices

In these appendices, we prove in part A Lemma 5, and in part B Lemma 6. For the proofs, 
we need the following lemma. 

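Lemma 7: The function $g(\rho, z)$ defined as

$$g(\rho, z) = \left( \frac{1}{2} (1 + z)^{\frac{1}{1+\rho}} + \frac{1}{2} (1 - z)^{\frac{1}{1+\rho}} \right)^{1+\rho} \tag{23}$$

for $z \in [0, 1]$ and $\rho \in \mathbb{R} \setminus \{-1\}$, is a concave non-increasing function in $z$ for $\rho \in (-\infty, -1) \cup [0, \infty)$, and a convex non-decreasing function in $z$ for $\rho \in (-1, 0]$.

Proof: Taking the first derivative with respect to $z$, we get

$$\frac{\partial g(\rho, z)}{\partial z} = \underbrace{2^{-(1+\rho)} \left( 1 + \left( \frac{1 - z}{1 + z} \right)^{\frac{1}{1+\rho}} \right)^{\rho}}_{\geq 0} \left( 1 - \left( \frac{1 - z}{1 + z} \right)^{\frac{-\rho}{1+\rho}} \right).$$

As we have

$$\frac{1 - z}{1 + z} \leq 1$$

for $z \in [0, 1]$, the monotonicity claims follow by noting that when $\rho \in (-\infty, -1) \cup [0, \infty)$:

$$\frac{\rho}{1 + \rho} \geq 0 \;\Rightarrow\; \left( 1 - \left( \frac{1 - z}{1 + z} \right)^{\frac{-\rho}{1+\rho}} \right) \leq 0 \;\Rightarrow\; \frac{\partial g(\rho, z)}{\partial z} \leq 0,$$

and when $\rho \in (-1, 0]$:

$$\frac{\rho}{1 + \rho} \leq 0 \;\Rightarrow\; \left( 1 - \left( \frac{1 - z}{1 + z} \right)^{\frac{-\rho}{1+\rho}} \right) \geq 0 \;\Rightarrow\; \frac{\partial g(\rho, z)}{\partial z} \geq 0.$$

Taking the second derivative with respect to $z$, we get

$$\frac{\partial^2 g(\rho, z)}{\partial z^2} = -\frac{\rho}{1 + \rho} \underbrace{(1 - z^2)^{\frac{1}{1+\rho} - 2} \left( \frac{1}{2} (1 + z)^{\frac{1}{1+\rho}} + \frac{1}{2} (1 - z)^{\frac{1}{1+\rho}} \right)^{\rho - 1}}_{\geq 0}.$$

The convexity claims follow once more by inspecting the sign of $\frac{\rho}{1+\rho}$ in different intervals, i.e. when $\rho \in (-\infty, -1) \cup [0, \infty)$:

$$\frac{\rho}{1 + \rho} \geq 0 \;\Rightarrow\; \frac{\partial^2 g(\rho, z)}{\partial z^2} \leq 0,$$

and when $\rho \in (-1, 0]$:

$$\frac{\rho}{1 + \rho} \leq 0 \;\Rightarrow\; \frac{\partial^2 g(\rho, z)}{\partial z^2} \geq 0. \qquad ■$$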
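The sign claims of Lemma 7 can be cross-checked against finite differences. The sketch below is illustrative: the closed forms in `dg` and `d2g` are obtained by differentiating the definition of $g$ directly, the helper names are hypothetical, and the test points are arbitrary values in the interior of $[0, 1]$.

```python
import math

def g(rho, z):
    s = 1.0 / (1.0 + rho)
    return (0.5 * (1 + z) ** s + 0.5 * (1 - z) ** s) ** (1 + rho)

def dg(rho, z):
    """First derivative of g in z, written with f = (1 - z) / (1 + z)."""
    f = (1 - z) / (1 + z)
    return (2.0 ** (-(1 + rho)) * (1 + f ** (1 / (1 + rho))) ** rho
            * (1 - f ** (-rho / (1 + rho))))

def d2g(rho, z):
    """Second derivative of g in z."""
    s = 1.0 / (1.0 + rho)
    a = 0.5 * (1 + z) ** s + 0.5 * (1 - z) ** s
    return -(rho / (1 + rho)) * (1 - z * z) ** (s - 2) * a ** (rho - 1)

eps = 1e-5
for rho in (-0.5, 0.5, 2.0):
    z = 0.5
    fd1 = (g(rho, z + eps) - g(rho, z - eps)) / (2 * eps)
    fd2 = (g(rho, z + eps) - 2 * g(rho, z) + g(rho, z - eps)) / eps ** 2
    print(rho, dg(rho, z), fd1, d2g(rho, z), fd2)
```

For $\rho = -0.5$ both derivatives come out non-negative (convex, non-decreasing), and for $\rho = 0.5$ and $\rho = 2$ both are non-positive (concave, non-increasing).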



Appendix A 

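Proof of Lemma 5: We prove that the function $F_{z,\rho}(t) = g(\rho, z\, g^{-1}(\rho, t))$ defined in Equation (19) is convex with respect to the variable $t$ for fixed $\rho \geq 0$ and $z \in [0, 1]$ values. Taking the first derivative with respect to $t$, we obtain

$$\frac{\partial}{\partial t} F_{z,\rho}(t) = \frac{\partial}{\partial t} g(\rho, z\, g^{-1}(\rho, t)) = z\, \frac{g'(\rho, z\, g^{-1}(\rho, t))}{g'(\rho, g^{-1}(\rho, t))},$$

where $g'$ denotes the derivative of $g$ with respect to its second argument. We define $u = g^{-1}(\rho, t)$. Since $g(\rho, u)$ is a non-increasing function in $u$ when $\rho \geq 0$ by Lemma 7, so is $g^{-1}(\rho, t)$ in $t$. Hence we can check the convexity of $F_{z,\rho}(t)$ with respect to the variable $t$ from the monotonicity with respect to $u$ of the following expression:

$$z\, \frac{g'(\rho, z u)}{g'(\rho, u)}. \tag{24}$$

To simplify notation, we define

$$f(u) = \frac{1 - u}{1 + u}, \tag{25}$$

$$\alpha(\rho, u) = \left( 1 + f(u)^{\frac{1}{1+\rho}} \right)^{\rho} \geq 0, \tag{26}$$

$$\beta(\rho, u) = \left( 1 - f(u)^{\frac{-\rho}{1+\rho}} \right) \leq 0. \tag{27}$$

Then, by equation (23),

$$\frac{\partial g(\rho, u)}{\partial u} = g'(\rho, u) = 2^{-(1+\rho)} \alpha(\rho, u) \beta(\rho, u).$$

Similarly,

$$\frac{\partial g(\rho, z u)}{\partial u} = z\, g'(\rho, z u) = 2^{-(1+\rho)} z\, \alpha(\rho, z u) \beta(\rho, z u),$$

and (24) is given by

$$z\, \frac{g'(\rho, z u)}{g'(\rho, u)} = z\, \frac{\alpha(\rho, z u) \beta(\rho, z u)}{\alpha(\rho, u) \beta(\rho, u)}. \tag{28}$$

Now taking the derivative of (28) with respect to $u$, we get

$$\frac{\partial}{\partial u} \left( z\, \frac{\alpha(\rho, z u) \beta(\rho, z u)}{\alpha(\rho, u) \beta(\rho, u)} \right) = \underbrace{z\, \frac{\alpha(\rho, z u) \beta(\rho, z u)}{\alpha(\rho, u) \beta(\rho, u)}}_{\geq 0} \left( \frac{\partial \alpha(\rho, z u)/\partial u}{\alpha(\rho, z u)} + \frac{\partial \beta(\rho, z u)/\partial u}{\beta(\rho, z u)} - \frac{\partial \alpha(\rho, u)/\partial u}{\alpha(\rho, u)} - \frac{\partial \beta(\rho, u)/\partial u}{\beta(\rho, u)} \right). \tag{29}$$

We can see that the sign of the expression inside the parentheses in (29) determines the monotonicity in $u$ of the expression in (28). At this point, we note that

$$\frac{\partial \alpha(\rho, u)/\partial u}{\alpha(\rho, u)} + \frac{\partial \beta(\rho, u)/\partial u}{\beta(\rho, u)} = \left( \frac{\partial \alpha(\rho, z u)/\partial u}{\alpha(\rho, z u)} + \frac{\partial \beta(\rho, z u)/\partial u}{\beta(\rho, z u)} \right) \Bigg|_{z = 1}. \tag{30}$$

Moreover, we claim that the expression inside the parentheses on the RHS of (30) is non-decreasing in $z$. As a consequence, the expression inside the parentheses in (29) is non-positive for $z \in [0, 1]$, so that (28) is non-increasing in $u = g^{-1}(\rho, t)$. Since $u$ is decreasing in $t$, we have

$$\frac{\partial^2 F_{z,\rho}(t)}{\partial t^2} = \frac{\partial}{\partial u} \left( z\, \frac{g'(\rho, z u)}{g'(\rho, u)} \right) \frac{\partial u}{\partial t} \geq 0.$$

We conclude that $F_{z,\rho}(t)$ is a convex function with respect to the variable $t$.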
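In the rest of the appendix, we prove our claim. We have

$$\frac{\partial \alpha(\rho, z u)}{\partial u} = \frac{\rho}{1 + \rho}\, z f'(z u)\, f(z u)^{\frac{1}{1+\rho} - 1} \left( 1 + f(z u)^{\frac{1}{1+\rho}} \right)^{\rho - 1}, \tag{31}$$

$$\frac{\partial \beta(\rho, z u)}{\partial u} = \frac{\rho}{1 + \rho}\, z f'(z u)\, f(z u)^{\frac{-\rho}{1+\rho} - 1}, \tag{32}$$

where

$$f'(u) = \frac{\partial f(u)}{\partial u} = \frac{-2}{(1 + u)^2}.$$

Hence, writing $f = f(z u)$ and $f' = f'(z u)$ for brevity,

$$\begin{aligned}
\frac{\partial \alpha(\rho, z u)/\partial u}{\alpha(\rho, z u)} + \frac{\partial \beta(\rho, z u)/\partial u}{\beta(\rho, z u)}
&= \frac{\rho}{1 + \rho}\, z f' f^{\frac{-\rho}{1+\rho} - 1} \left( \frac{f}{1 + f^{\frac{1}{1+\rho}}} + \frac{1}{1 - f^{\frac{-\rho}{1+\rho}}} \right) \\
&= \frac{\rho}{1 + \rho}\, z f' f^{\frac{-\rho}{1+\rho} - 1} \left( \frac{f - f^{\frac{1}{1+\rho}} + 1 + f^{\frac{1}{1+\rho}}}{\big( 1 + f^{\frac{1}{1+\rho}} \big) \big( 1 - f^{\frac{-\rho}{1+\rho}} \big)} \right) \\
&= \frac{\rho}{1 + \rho}\, z f' \left( 1 + f^{-1} \right) \left( 1 + f^{\frac{1}{1+\rho}} \right)^{-1} \left( f^{\frac{\rho}{1+\rho}} - 1 \right)^{-1} \\
&= \frac{\rho}{1 + \rho} \cdot \frac{-4 z}{(1 + z u)^2 (1 - z u)} \left( 1 + \left( \frac{1 - z u}{1 + z u} \right)^{\frac{1}{1+\rho}} \right)^{-1} \left( \left( \frac{1 - z u}{1 + z u} \right)^{\frac{\rho}{1+\rho}} - 1 \right)^{-1} \\
&= \frac{4 \rho}{1 + \rho} \Bigg( \underbrace{\frac{1 - z^2 u^2}{z} \left( (1 + z u)^{\frac{\rho}{1+\rho}} - (1 - z u)^{\frac{\rho}{1+\rho}} \right)}_{\text{Part 2}} \; \underbrace{\left( (1 + z u)^{\frac{1}{1+\rho}} + (1 - z u)^{\frac{1}{1+\rho}} \right)}_{\text{Part 1}} \Bigg)^{-1}.
\end{aligned}$$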

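We consider the expressions labeled as Part 1 and Part 2 separately. Note that both are positive valued. In addition, we will show that both are decreasing in $z$. As a result, we deduce

$$\frac{\partial}{\partial z} \left( \frac{1 - z^2 u^2}{z} \left( (1 + u z)^{\frac{\rho}{1+\rho}} - (1 - u z)^{\frac{\rho}{1+\rho}} \right) \left( (1 + z u)^{\frac{1}{1+\rho}} + (1 - z u)^{\frac{1}{1+\rho}} \right) \right) \leq 0,$$

so that the reciprocal of this product, and hence the RHS of (30), is non-decreasing in $z$, which proves our claim.

For Part 1, we get

$$\frac{\partial}{\partial z} \left( (1 + z u)^{\frac{1}{1+\rho}} + (1 - z u)^{\frac{1}{1+\rho}} \right) = \frac{u}{1 + \rho} \left( (1 + u z)^{\frac{-\rho}{1+\rho}} - (1 - u z)^{\frac{-\rho}{1+\rho}} \right) \leq 0.$$

For Part 2, with $k = \frac{\rho}{1+\rho}$ and $x = u z$, we have

$$\begin{aligned}
\frac{\partial}{\partial z} \left( \frac{1 - z^2 u^2}{z} \left( (1 + u z)^{k} - (1 - u z)^{k} \right) \right)
&= \frac{1}{z^2} \Big( k\, u z\, (1 - u^2 z^2) \left( (1 + u z)^{k - 1} + (1 - u z)^{k - 1} \right) - (1 + u^2 z^2) \left( (1 + u z)^{k} - (1 - u z)^{k} \right) \Big) \\
&= \frac{1}{z^2} \Big( (1 + u z)^{k} \big( k\, u z (1 - u z) - (1 + u^2 z^2) \big) + (1 - u z)^{k} \big( k\, u z (1 + u z) + (1 + u^2 z^2) \big) \Big) \\
&= \frac{1}{z^2} \Big( -(1 + x)^{k} \big( (k + 1) x^2 - k x + 1 \big) + (1 - x)^{k} \big( (k + 1) x^2 + k x + 1 \big) \Big) \\
&= \frac{1}{z^2} \big( -f_1(x, k) + f_2(x, k) \big)
\end{aligned} \tag{33}$$

where $k = \frac{\rho}{1+\rho} \in [0, 1)$, $x = u z \in [0, 1]$, and

$$f_1(x, k) = (1 + x)^{k} \big( (k + 1) x^2 - k x + 1 \big), \tag{34}$$
$$f_2(x, k) = (1 - x)^{k} \big( (k + 1) x^2 + k x + 1 \big). \tag{35}$$

We will show that $f_1(x, k) \geq f_2(x, k)$ holds for $x \in [0, 1]$ and for $k \in [0, 1)$. Since $f_1(x, k), f_2(x, k) \geq 0$, this is equivalent to showing that $\log \frac{f_1(x, k)}{f_2(x, k)} \geq 0$ holds. We have

$$\log \frac{f_1(x, k)}{f_2(x, k)} = k \log \frac{1 + x}{1 - x} + \log\big( (k + 1) x^2 - k x + 1 \big) - \log\big( (k + 1) x^2 + k x + 1 \big).$$

We immediately observe that when $k = 0$, the above sum equals 0. Now, we will show that

$$\frac{\partial}{\partial k} \log \frac{f_1(x, k)}{f_2(x, k)} \geq 0.$$

Hence, this will prove our claim that $f_1(x, k) \geq f_2(x, k)$ holds. Taking the first derivative with respect to $k$, we have

$$\frac{\partial}{\partial k} \log \frac{f_1(x, k)}{f_2(x, k)} = \log \frac{1 + x}{1 - x} - \frac{2 x (1 + x^2)}{\big( 1 + (k + 1) x^2 \big)^2 - (k x)^2}.$$

So, we will be done if

$$\log \frac{1 + x}{1 - x} \geq 2 x (1 + x^2) \max_{k \in [0, 1)} \frac{1}{\big( 1 + (k + 1) x^2 \big)^2 - (k x)^2}.$$

One can easily check that the expression in the denominator, $\big( 1 + (k + 1) x^2 \big)^2 - (k x)^2$, is non-decreasing in $k \in [0, 1)$, hence the reciprocal is non-increasing in $k$. As a result, the maximum is attained at $k = 0$. Therefore, we only have to prove that

$$\log \frac{1 + x}{1 - x} \geq \frac{2 x (1 + x^2)}{(1 + x^2)^2} = \frac{2 x}{1 + x^2}$$

holds. But, we have

$$\log \frac{1 + x}{1 - x} = 2 x \left( 1 + \frac{1}{3} x^2 + \frac{1}{5} x^4 + \frac{1}{7} x^6 + \dots \right) \geq 2 x \geq \frac{2 x}{1 + x^2}.$$

So, $-f_1(x, k) + f_2(x, k) \leq 0$ holds for $k \in [0, 1)$ and $x \in [0, 1]$. Consequently, Part 2 is also decreasing in $z$. This proves our claim that the RHS of (30) is non-decreasing in $z$. ■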
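The convexity established in this appendix can also be probed numerically. The following sketch is an illustration with hypothetical helper names: it inverts $g(\rho, \cdot)$ by bisection, evaluates $F_{z,\rho}(t) = g(\rho, z\, g^{-1}(\rho, t))$ on a uniform grid over $[2^{-\rho}, 1]$, and reports the largest amount by which the midpoint value exceeds the average of its two neighbours. For a convex function this quantity should not be positive beyond rounding error.

```python
import math

def g(rho, z):
    s = 1.0 / (1.0 + rho)
    return (0.5 * (1 + z) ** s + 0.5 * (1 - z) ** s) ** (1 + rho)

def g_inverse(rho, t, iters=200):
    """Solve g(rho, u) = t for u in [0, 1] by bisection (g is non-increasing in u)."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if g(rho, mid) > t:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def F(rho, z, t):
    return g(rho, z * g_inverse(rho, t))

def max_midpoint_violation(rho, z, n=50):
    """max over the grid of F(mid) - (F(left) + F(right)) / 2; <= 0 if F is convex."""
    a = 2.0 ** (-rho)
    ts = [a + (1 - a) * i / n for i in range(n + 1)]
    worst = -math.inf
    for i in range(n - 1):
        gap = F(rho, z, ts[i + 1]) - 0.5 * (F(rho, z, ts[i]) + F(rho, z, ts[i + 2]))
        worst = max(worst, gap)
    return worst

for rho in (0.5, 1.0, 3.0):
    for z in (0.3, 0.7):
        print(rho, z, max_midpoint_violation(rho, z))
```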

Appendix B 

Proof of Lemma 6: In this appendix, we show that the function $H_{z,\rho}(t) = h(\rho, g^{-1}(\rho, t), z)$ defined in Lemma 6 is concave with respect to the variable $t$ when $\rho \in [0, 1] \cup [2, \infty]$, and convex when $\rho \in [1, 2]$, for any fixed $z \in [0, 1]$ and $\rho > 0$. Taking the first derivative with respect to $t$, we get

$$\frac{\partial}{\partial t} H_{z,\rho}(t) = \frac{h'(\rho, g^{-1}(\rho, t), z)}{g'(\rho, g^{-1}(\rho, t))},$$

where $h'$ and $g'$ denote derivatives with respect to the second argument. As we did in Appendix A, we define $u = g^{-1}(\rho, t)$. Since $g(\rho, u)$ is a non-increasing function in $u$ by Lemma 7, so is $g^{-1}(\rho, t)$ in $t$. Hence we can check the concavity of $H_{z,\rho}(t)$ with respect to the variable $t$ by verifying that

$$\frac{h'(\rho, u, z)}{g'(\rho, u)}$$

is non-decreasing in $u$. So, we check that

$$\frac{\partial}{\partial u} \left( \frac{h'(\rho, u, z)}{g'(\rho, u)} \right) = \frac{h''(\rho, u, z)\, g'(\rho, u) - h'(\rho, u, z)\, g''(\rho, u)}{g'(\rho, u)^2} \geq 0.$$

Since the denominator is always positive, we only need to show that

$$h''(\rho, u, z)\, g'(\rho, u) - h'(\rho, u, z)\, g''(\rho, u) \geq 0. \tag{36}$$

Moreover, we observe that $h(\rho, u, 0) = g(\rho, u)$. So, since $h'$ and $g'$ are both non-positive, we can equivalently show that the following relation holds:

$$\frac{h''(\rho, u, z)}{h'(\rho, u, z)} \geq \frac{h''(\rho, u, 0)}{h'(\rho, u, 0)}. \tag{37}$$

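We first apply the transformations

$$u = \tanh(k), \qquad z = \tanh(w)$$

where $k, w \in [0, \infty)$. For shorthand notation, let $h(\rho, \tanh(k), \tanh(w)) = h(\rho, k, w)$. Using these, we obtain

$$h(\rho, k, w) = \frac{\cosh\!\left( \frac{k + w}{1 + \rho} \right)^{1+\rho} + \cosh\!\left( \frac{k - w}{1 + \rho} \right)^{1+\rho}}{2 \cosh(k) \cosh(w)}.$$

Then,

$$\frac{\partial^2 h(\rho, k, w) / \partial k^2}{\partial h(\rho, k, w) / \partial k} = -2 \tanh(k) + \frac{\rho}{1 + \rho} \cosh(k) \times \left( \frac{\cosh\!\left( \frac{k + w}{1 + \rho} \right)^{\rho - 1} + \cosh\!\left( \frac{k - w}{1 + \rho} \right)^{\rho - 1}}{\cosh\!\left( \frac{k + w}{1 + \rho} \right)^{\rho} \sinh\!\left( \frac{\rho}{1 + \rho} k - \frac{1}{1 + \rho} w \right) + \cosh\!\left( \frac{k - w}{1 + \rho} \right)^{\rho} \sinh\!\left( \frac{\rho}{1 + \rho} k + \frac{1}{1 + \rho} w \right)} \right). \tag{38}$$

We note that the additive term $-2 \tanh(k)$ and the non-negative multiplicative factor $\cosh(k)$ do not depend on $w$. Hence, we only need to show that the term inside the parentheses is smallest when evaluated at $w = 0$. For this purpose, we define the transformations

$$a = \frac{k + w}{1 + \rho}, \qquad b = \frac{k - w}{1 + \rho}$$

such that $k = (1 + \rho) \frac{a + b}{2}$ and $w = (1 + \rho) \frac{a - b}{2}$. The condition $k, w \geq 0$ is equivalent to $a \geq |b|$. Using these transformations, the reciprocal of the term inside the parentheses in equation (38) becomes

$$R(\rho, a, b) = \frac{\cosh(b)^{1 - \rho} \cosh(a) \sinh\!\left( \frac{a + b}{2} \rho - \frac{a - b}{2} \right) + \cosh(a)^{1 - \rho} \cosh(b) \sinh\!\left( \frac{a + b}{2} \rho + \frac{a - b}{2} \right)}{\cosh(a)^{1 - \rho} + \cosh(b)^{1 - \rho}}.$$

Therefore, the inequality given in (37) will hold iff

$$R(\rho, a, b) \leq R\!\left( \rho, \frac{a + b}{2}, \frac{a + b}{2} \right) = \cosh\!\left( \frac{a + b}{2} \right) \sinh\!\left( \frac{a + b}{2} \rho \right). \tag{39}$$

We define

$$\begin{aligned}
f(\rho, a, b) ={}& \cosh\!\left( \frac{a + b}{2} \right) \sinh\!\left( \rho \frac{a + b}{2} \right) \left[ \cosh(a)^{1 - \rho} + \cosh(b)^{1 - \rho} \right] \\
& - \cosh(a)^{1 - \rho} \cosh(b) \sinh\!\left( \rho \frac{a + b}{2} + \frac{a - b}{2} \right) - \cosh(b)^{1 - \rho} \cosh(a) \sinh\!\left( \rho \frac{a + b}{2} - \frac{a - b}{2} \right).
\end{aligned}$$

We note that $f(\rho, a, b) \geq 0$ is equivalent to the inequality (39), which in turn is equivalent to the inequality (37).

After simplifications, the function reduces to the following form:

$$f(\rho, a, b) = \sinh\!\left( \frac{a - b}{2} \right) J(\rho, a, b)$$

where

$$J(\rho, a, b) = \cosh(b)^{1 - \rho} \cosh\!\left( a - \rho \frac{a + b}{2} \right) - \cosh(a)^{1 - \rho} \cosh\!\left( b - \rho \frac{a + b}{2} \right).$$

Since for $a \geq b$ we have

$$\sinh\!\left( \frac{a - b}{2} \right) \geq 0,$$

we only need to show that $J(\rho, a, b) \geq 0$.

We introduce the variables $k'$ and $w'$ using $a = k' + w'$ and $b = k' - w'$, where $k', w' \in [0, \infty)$. Then, we get

$$J(\rho, k' + w', k' - w') = \cosh(k' - w')^{1 - \rho} \cosh(k' - \rho k' + w') - \cosh(k' - \rho k' - w') \cosh(k' + w')^{1 - \rho}.$$

We note that $J(\rho, k' + w', k' - w') \big|_{k' = 0} = 0$. Moreover, $J(\rho, k' + w', k' - w')$ is increasing in the variable $k'$: taking the first derivative with respect to $k'$, we get

$$\frac{\partial}{\partial k'} J(\rho, k' + w', k' - w') = (1 - \rho) \left[ \cosh(k' - w')^{-\rho} - \cosh(k' + w')^{-\rho} \right] \sinh((2 - \rho) k') \geq 0$$

where, for $\rho \in [0, 1]$, the positivity follows from the fact that $|k' - w'| \leq |k' + w'|$, thus $\cosh(k' - w') \leq \cosh(k' + w')$ and $\cosh(k' - w')^{-\rho} \geq \cosh(k' + w')^{-\rho}$, and from the fact that $\sinh(x) \geq 0$ holds for all $x \geq 0$. For $\rho \geq 2$, the factors $(1 - \rho)$ and $\sinh((2 - \rho) k')$ are both non-positive, so the product is again non-negative. For $\rho \in [1, 2]$, the factor $(1 - \rho)$ is non-positive while the other two factors are non-negative, so the derivative is non-positive; the inequalities above are then reversed, which gives the convexity claim.

As a result, $J(\rho, k' + w', k' - w') \geq 0$ as required when $\rho \in [0, 1] \cup [2, \infty]$, and we have shown that the inequality given in (37) holds. This concludes the proof. ■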
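The hyperbolic form of $h$ used in this appendix can be checked against its definition in terms of $g$. The sketch below is illustrative only (function names are assumptions): it evaluates $h(\rho, z_1, z_2)$ from the definition with $z_1 = \tanh(k)$, $z_2 = \tanh(w)$ and compares it with the expression in $\cosh$ of $(k \pm w)/(1 + \rho)$.

```python
import math

def g(rho, z):
    # g is symmetric about z = 0, so negative arguments in (-1, 1) are fine
    s = 1.0 / (1.0 + rho)
    return (0.5 * (1 + z) ** s + 0.5 * (1 - z) ** s) ** (1 + rho)

def h(rho, z1, z2):
    """h from its definition in terms of g (requires z1 * z2 < 1)."""
    p = z1 * z2
    return (0.5 * (1 + p) * g(rho, (z1 + z2) / (1 + p))
            + 0.5 * (1 - p) * g(rho, (z1 - z2) / (1 - p)))

def h_hyperbolic(rho, k, w):
    """Same quantity after the substitution z1 = tanh(k), z2 = tanh(w)."""
    a = (k + w) / (1 + rho)
    b = (k - w) / (1 + rho)
    num = math.cosh(a) ** (1 + rho) + math.cosh(b) ** (1 + rho)
    return num / (2 * math.cosh(k) * math.cosh(w))

for rho, k, w in ((0.5, 0.3, 1.2), (1.0, 0.8, 0.8), (2.5, 1.5, 0.4)):
    print(h(rho, math.tanh(k), math.tanh(w)), h_hyperbolic(rho, k, w))
```

Setting `w = 0` in `h_hyperbolic` gives $\cosh(k/(1+\rho))^{1+\rho} / \cosh(k)$, which is $g(\rho, \tanh(k))$, in agreement with $h(\rho, u, 0) = g(\rho, u)$.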